Skip to content

Mermaid PR architecture-diff action (level-1 default, nested opt-in)#2

Merged
brovatten merged 27 commits into
mainfrom
feat/mermaid-architecture-diff
Jun 6, 2026
Merged

Mermaid PR architecture-diff action (level-1 default, nested opt-in)#2
brovatten merged 27 commits into
mainfrom
feat/mermaid-architecture-diff

Conversation

@brovatten
Copy link
Copy Markdown
Member

Replaces the webview/Playwright PNG approach with an inline Mermaid diagram GitHub renders natively in the PR comment.

  • Base = committed .codeboarding/analysis.json if present, else a full engine run on the base commit; head analyzed incrementally from it.
  • scripts/diff_to_mermaid.py diffs the two analyses and emits a colored graph LR (green added / yellow modified / red deleted, nodes + arrows).
  • Level 1 (flat) default; nested: true for depth>1 subgraphs.
  • scripts/run_local.sh for local iteration.

The test-self.yml workflow runs this action on this PR as a live test (remove before merge). This PR comment thread should get an architecture-diff comment shortly.

brovatten added 2 commits June 3, 2026 15:24
…t-in)

Replaces the webview/Playwright PNG approach with an inline Mermaid diagram
that GitHub renders natively in the PR comment — no image, no orphan branch,
no contents:write, and fork-friendly.

How it works:
- Resolve a base ("before") analysis: use the committed
  .codeboarding/analysis.json at the PR base if present, else generate one via
  a full engine run on the base commit.
- Analyze the PR head incrementally, seeded from the base (stable component
  ids), falling back to a full run on cache miss.
- scripts/diff_to_mermaid.py diffs the two analyses (name-based matching;
  relation label change => modified) and emits a graph LR with nodes colored
  via classDef/class and arrows via positional linkStyle: green added, yellow
  modified, red dashed deleted. Escaping, deleted-namespace keying, and a size
  guard (GitHub's ~500-edge / 50k-char cap -> changed-only or text fallback).

Rendering:
- Level 1 (flat, top-level) is the default — readable inline, never trips the
  size cap.
- nested: true draws depth>1 sub-components as subgraphs (leaf nodes filled,
  parent containers outlined). Optional --font-size/--node-padding/spacing emit
  an %%{init}%% directive to enlarge nodes.

scripts/run_local.sh mirrors the action for local iteration (fast diff-only or
full pipeline) and writes a browser HTML preview rendered with mermaid.js.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Architecture review · 7 components changed

graph LR
    n_Execution_Orchestrator["Execution Orchestrator"]
    n_Structural_Analysis_Engine["Structural Analysis Engine"]
    n_Diagram_Synthesis_Component["Diagram Synthesis Component"]
    n_Interaction_CTA_Provider["Interaction #amp; CTA Provider"]
    del_Environment_Context_Resolver["Environment #amp; Context Resolver"]
    del_Remote_Job_Orchestrator["Remote Job Orchestrator"]
    del_Artifact_Result_Processor["Artifact #amp; Result Processor"]
    n_Execution_Orchestrator -- "Passes snapshots to" --> n_Structural_Analysis_Engine
    n_Structural_Analysis_Engine -- "Provides diffs to" --> n_Diagram_Synthesis_Component
    n_Structural_Analysis_Engine -- "Supplies change data to" --> n_Interaction_CTA_Provider
    n_Diagram_Synthesis_Component -- "Integrates links into" --> n_Interaction_CTA_Provider
    n_Diagram_Synthesis_Component -- "calls" --> n_Structural_Analysis_Engine
    del_Environment_Context_Resolver -- "passes resolved repository URL and branch to in…" --> del_Remote_Job_Orchestrator
    del_Remote_Job_Orchestrator -- "hands off JSON response containing full job res…" --> del_Artifact_Result_Processor
    del_Artifact_Result_Processor -- "writes extracted documentation data to runner f…" --> del_Artifact_Result_Processor
    del_Remote_Job_Orchestrator -- "manages job configuration and polls for status…" --> del_Remote_Job_Orchestrator
    classDef added fill:#1f883d,stroke:#0b5d23,color:#ffffff;
    classDef modified fill:#bf8700,stroke:#7d4e00,color:#ffffff;
    classDef deleted fill:#cf222e,stroke:#82071e,color:#ffffff,stroke-dasharray:5 3;
    class n_Execution_Orchestrator,n_Structural_Analysis_Engine,n_Diagram_Synthesis_Component,n_Interaction_CTA_Provider added;
    class del_Environment_Context_Resolver,del_Remote_Job_Orchestrator,del_Artifact_Result_Processor deleted;
    linkStyle 0,1,2,3,4 stroke:#0b5d23,stroke-width:2px;
    linkStyle 5,6,7,8 stroke:#82071e,stroke-width:2px,stroke-dasharray:5 3;
Loading

Colours indicate components that have been 🟩 added · 🟨 modified · 🟥 removed — versus main.


🧭 See this architecture in your editor: Open in VS Code →

💡 New to CodeBoarding? Get the extension →

codeboarding-action · run 26942517446

brovatten added 11 commits June 3, 2026 16:37
…n editor' CTA

Move CTA building into scripts/build_cta.py (testable; docstring cites the 2025
SO survey: VS Code 75.9% + Cursor 17.9% ~= 94% justifies VS Code+Cursor only).
.vscode dir -> VS Code, .cursor -> Cursor, both -> both, neither -> VS Code.
depth_level drives the engine (deep, rich data); new render_depth controls how
many component levels the PR Mermaid draws (default 1 = top-level flat). Replaces
the nested boolean. Self-test now runs depth_level=2 + render_depth=1.
…rl), unit tests + CI

- engine_ref default main -> v0.12.0 (verified to have the run_full/run_incremental/
  health/StaticAnalysisCache API); reproducible default, overridable.
- README: engine_ref + cta_base_url rows; depth_level vs render_depth clarified.
- tests/: 21 stdlib unittest cases for diff_to_mermaid + build_cta.
- .github/workflows/test.yml runs them on push/PR (no LLM).
…cache

The manual .venv cache was restored then immediately wiped by 'uv venv --clear',
so it never helped. Reuse it when present and enable uv's download cache for cold installs.
Guard now resolves the PR from either a pull_request event or an issue_comment
'/codeboarding' command (comment body read from env, never interpolated -> no
injection; SHAs fetched via gh api). Reacts 👀 to acknowledge. Configurable via
trigger_command. Sticky comment + base_ref now use the resolved PR explicitly.
…e); CTA -> extension-direct

- docs/COMMIT_STRATEGY.md: commit analysis.json + health_report.json (text,
  display-critical); do NOT commit static_analysis.pkl (binary -> actions/cache).
  Forward-compatible with a future hosted webview.
- build_cta: drop the 'explore in browser' webview tier (deferred); footer now
  drives straight to the extension (open in VS Code/Cursor + get the extension).
…ing)

Security
- CRITICAL: gate the /codeboarding (issue_comment) path on author_association
  (OWNER/MEMBER/COLLABORATOR) before any checkout/analysis — closes the
  pwn-request/secret-exfiltration hole; README example + security note updated.
- Pass base_ref/header/cta_base via env (not interpolated into bash); extract
  engine python -c into scripts/cb_engine.py (no ${{ }}->python-source interp);
  preflight prints only the error message, not the whole auth body.

Bugs (diff_to_mermaid.py)
- Escape Mermaid shape metacharacters (] [ ( ) { } | < > &) — a bare ] no longer
  breaks GitHub rendering.
- n_changed is now recursive + a `changed` flag covers relation/nested-only
  changes; the action shows the diagram on `rendered` instead of suppressing it.
- Size guard counts drawn edges and RETRIES changed-only before giving up.
- _filter_changed ignores empty id/name strings.
- Deleted parent keeps its subtree (renders as a deleted subgraph, symmetric
  with added).

action.yml hardening
- Guard: set -uo pipefail + graceful gh-api failure + assert base/head SHAs.
- Generate-base: worktree remove/prune (not just rm -rf); cache the generated
  base keyed by base SHA (pay the cold full-analysis once).
- Seed static_analysis.sha with the pkl (enables the committed-baseline
  warm-start that was previously unreachable).
- Health step writes 0 first (no stale count on exception).
- Post a failure comment (if: failure()) so a failed run isn't silent.

Workflows
- release-major-tag: anchor the semver regex ($) so prereleases don't move vN.
- test-self + README example: concurrency cancel-in-progress.

Tests: +cb_engine smoke tests, escaping/changed-flag cases (31 total, all green).
@brovatten
Copy link
Copy Markdown
Member Author

/codeboarding

brovatten added 14 commits June 6, 2026 12:48
The cb-base cache stores a full LLM analysis whose output depends on
agent_model/parsing_model, but the key omitted them: re-running on the
same base SHA with a different model served a stale-model 'before'
snapshot, producing phantom added/modified/deleted components. Add both
models to the restore+save keys.

Drop the cb-uv restore-keys fallback: on a lockfile change it restored a
venv built from a different uv.lock, and 'uv pip install -e .' does not
reconcile already-installed transitive deps back to the lock, silently
running mismatched versions. Exact-key hits (unchanged lock) still warm-start.
…check

_diff_methods built full per-file sorted added/removed dicts, but only
the boolean 'any change?' answer was used — the method_diff payload was
attached to every component and never read by the renderer or any test.
Replace it with a short-circuiting _has_method_changes() (no sorting, no
payload), and fold _has_structural_changes' if/return ladder into one
boolean return. Behaviour-identical; 41 tests green.
…able fallback

Lock down three untested renderer/engine paths: _truncate's 48-char
ellipsis boundary, _esc stripping raw newlines/CR (a raw newline breaks
the whole Mermaid block), and run_head falling back to a full analysis on
BaselineUnavailableError (only IncrementalCacheMissingError was covered).
Lead with a rendered sample Mermaid diagram so readers see the actual
colored output (a visual tool should show, not tell). Drop the marketing
lead line, and fold the overlapping 'when it runs', 'on-demand', and
trailing security paragraphs into one section that states each fact
(auto-once vs /codeboarding, default-branch rule, fork refusal) exactly
once.
The agent_model/parsing_model defaults were a hardcoded 'openrouter/anthropic/
claude-sonnet-4' that OpenRouter rejects (400 invalid model id — the engine
sends the slug verbatim, so it must be a bare OpenRouter id like
anthropic/claude-sonnet-4, no 'openrouter/' prefix). Default both to empty and
export AGENT_MODEL/PARSING_MODEL only when set, so an unset model defers to the
engine's own valid per-provider default instead of a string that rots over time.
Users override via the inputs (optionally wired to repo secrets).
State where the OPENROUTER_API_KEY secret goes (and the env-var path for local
runs), and that agent_model/parsing_model are optional — omit for the engine's
per-provider default, or pin to a bare OpenRouter slug (inline or via secret).
…' slug

run_local.sh still defaulted AGENT_MODEL/PARSING_MODEL to openrouter/anthropic/
claude-sonnet-4 — the litellm-prefixed string that 400s against the engine's
OpenRouter ChatOpenAI path (the bug just fixed in action.yml). Default both to
empty and export only when set, so the local harness mirrors the action and an
unset model falls through to the engine's own valid per-provider default.
…at rule

A model name is non-sensitive config, so steer overrides to a repo variable
(vars.AGENT_MODEL) and keep secrets. for the key. State the format rule once:
bare OpenRouter slug, exactly one '/', no 'openrouter/' prefix.
The engine calls OpenRouter natively (langchain ChatOpenAI), so the model must
be a bare slug; the litellm 'openrouter/...' form 400s deep inside the engine.
Fail fast in the verify step with a one-line actionable error instead, the same
early-validation the API key already gets.
The diff step expanded "${CHANGED_ONLY[@]}" etc. directly; on macOS's bash
3.2 an empty array under 'set -u' errors as 'unbound variable', so the full
local pipeline crashed at diff->mermaid whenever the optional flags weren't
passed. Use the ${arr[@]+"${arr[@]}"} idiom (elements when set, nothing when
empty) so it works on bash 3.2 and 4+.
Add a 'Bring your own LLM provider' section: lead with the <NAME>_API_KEY
convention + one example, collapse the full provider table, and point at the
engine's registry (agents/llm_config.py) as the source of truth so the README
can't rot. Scope the 'bare OpenRouter slug' model-format note to OpenRouter.
The action hardcoded OPENROUTER_API_KEY + an OpenRouter-only preflight, so a
user with an OpenAI/Anthropic/etc. key couldn't use it. Add an optional
'llm_provider' input that maps the key to the engine's env var by the
<NAME>_API_KEY convention (with aws_bedrock/ollama exceptions); the engine then
auto-selects the provider. Gate the OpenRouter preflight and the 'openrouter/'
model-prefix guard behind provider==openrouter, export only the selected env
var in both engine steps, and add llm_provider to the base-cache key. Default
stays openrouter, so existing workflows are unchanged. No engine change.
@brovatten brovatten merged commit d7f12cb into main Jun 6, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant